Greg Detre
Tuesday, October 01, 2002
"WordNet is an on-line lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, and adjectives are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets."
"The price of imposing this syntactic categorization on WordNet is a certain amount of redundancy that conventional dictionaries avoid - words like back, for example, turn up in more than one category. But the advantage is that fundamental differences in the semantic organization of these syntactic categories can be clearly seen and systematically exploited. As will become clear from the papers following this one, nouns are organized in lexical memory as topical hierarchies, verbs are organized by a variety of entailment relations, and adjectives and adverbs are organized as N-dimensional hyperspaces. Each of these lexical structures reflects a different way of categorizing experience; attempts to impose a single organizing principle on all syntactic categories would badly misrepresent the psychological complexity of lexical knowledge."
"Lexical semantics begins with a recognition that a word is a conventional association between a lexicalized concept and an utterance that plays a syntactic role."
"Since the word 'word' is commonly used to refer both to the utterance and to its associated concept, discussions of this lexical association are vulnerable to terminological confusion. In order to reduce ambiguity, therefore, 'word form' will be used here to refer to the physical utterance or inscription and 'word meaning' to refer to the lexicalized concept that a form can be used to express. Then the starting point for lexical semantics can be said to be the mapping between forms and meanings (Miller, 1986). A conservative initial assumption is that different syntactic categories of words may have different kinds of mappings."
constructive
vs differential definitions (in place of meaning)
"These synonym sets (synsets) do not explain what the concepts are; they merely signify that the concepts exist. People who know English are assumed to have already acquired the concepts, and are expected to recognize them from the words listed in the synset."
"A lexical matrix, therefore, can be represented for theoretical purposes by a mapping between written words and synsets. Since English is rich in synonyms, synsets are often sufficient for differential purposes. Sometimes, however, an appropriate synonym is not available, in which case the polysemy can be resolved by a short gloss."
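The lexical matrix idea - word forms mapped many-to-many onto synsets - can be sketched as a toy data structure. The words, senses, and helper names below are illustrative inventions, not WordNet's actual storage format:

```python
# Each synset stands in for one lexicalized concept: a set of the word
# forms that can express it. "board" appears in two synsets (polysemy);
# "board" and "plank" share a synset (synonymy).
synsets = [
    frozenset({"board", "plank"}),      # a piece of sawn timber
    frozenset({"board", "committee"}),  # a group of people with a duty
]

def senses(word_form):
    """Polysemy: every synset (concept) this form can express."""
    return [s for s in synsets if word_form in s]

def synonyms(word_form):
    """Synonymy: other forms sharing at least one synset with this one."""
    return {f for s in senses(word_form) for f in s} - {word_form}

print(len(senses("board")))        # two senses
print(sorted(synonyms("board")))   # forms reachable through its synsets
```

Reading the matrix by row gives a synset (the differential "definition" of a concept); reading by column gives the senses of a form.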
synonymy:
"According to one definition (usually attributed to Leibniz) two expressions are synonymous if the substitution of one for the other never changes the truth value of a sentence in which the substitution is made. By that definition, true synonyms are rare, if they exist at all. A weakened version of this definition would make synonymy relative to a context: two expressions are synonymous in a linguistic context C if the substitution of one for the other in C does not alter the truth value."
"That is to say, if concepts are represented by synsets, and if synonyms must be interchangeable, then words in different syntactic categories cannot be synonyms (cannot form synsets) because they are not interchangeable. Nouns express nominal concepts, verbs express verbal concepts, and modifiers provide ways to qualify those concepts. In other words, the use of synsets to represent word meanings is consistent with psycholinguistic evidence that nouns, verbs, and modifiers are organized independently in semantic memory."
they take synonymy (even when defined in terms of substitutability in truth-conditions) to be a continuum, and symmetrical
antonymy
"Antonymy is a lexical relation between word forms, not a semantic relation between word meanings. For example, the meanings {rise, ascend} and {fall, descend} may be conceptual opposites, but they are not antonyms; [rise/fall] are antonyms and so are [ascend/descend], but most people hesitate and look thoughtful when asked if rise and descend, or ascend and fall, are antonyms. Such facts make apparent the need to distinguish between semantic relations between word forms and semantic relations between word meanings. Antonymy provides a central organizing principle for the adjectives and adverbs in WordNet, and the complications that arise from the fact that antonymy is a semantic relation between words are better discussed in that context."
hyponymy
"hyponymy/hypernymy is a semantic relation between word meanings: e.g., {maple} is a hyponym of {tree}, and {tree} is a hyponym of {plant}."
"For example, maple inherits the features of its superordinate, tree, but is distinguished from other trees by the hardness of its wood, the shape of its leaves, the use of its sap for syrup, etc. This convention provides the central organizing principle for the nouns in WordNet."
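This inheritance scheme - store only distinguishing features at each node, collect the rest by walking up the superordinate chain - can be sketched minimally. The chain and feature sets here are invented for illustration:

```python
# hyponym -> superordinate ("ISA" links)
hypernym = {"maple": "tree", "tree": "plant"}

# Only the distinguishing features are stored at each concept.
features = {
    "plant": {"living organism", "cellulose cell walls"},
    "tree":  {"distinct trunk", "woody", "perennial"},
    "maple": {"hard wood", "lobed leaves", "sap used for syrup"},
}

def all_features(concept):
    """Inherit: union of features along the superordinate chain."""
    acc = set()
    while concept is not None:
        acc |= features.get(concept, set())
        concept = hypernym.get(concept)
    return acc

print("living organism" in all_features("maple"))  # inherited from plant
```

The point of the design is that "living organism" need never be stored on maple at all; it is recovered on demand from plant.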
"Another relation sharing these advantages - a semantic relation - is the part-whole (or HASA) relation, known to lexical semanticists as meronymy/holonymy. A concept represented by the synset {x, x', ...} is a meronym of a concept represented by the synset {y, y', ...} if native speakers of English accept sentences constructed from such frames as A y has an x (as a part) or An x is a part of y. The meronymic relation is transitive (with qualifications) and asymmetrical (Cruse, 1986), and can be used to construct a part hierarchy (with some reservations, since a meronym can have many holonyms). It will be assumed that the concept of a part of a whole can be a part of a concept of the whole, although it is recognized that the implications of this assumption deserve more discussion than they will receive here."
"These and other similar relations serve to organize the mental lexicon. They can be represented in WordNet by parenthetical groupings or by pointers (labeled arcs) from one synset to another."
WordNet incorporates inflectional morphology
Abstract:
"Distinguishing features are entered in such a way as to create a lexical inheritance system, a system in which each word inherits the distinguishing features of all its superordinates. Three types of distinguishing features are discussed: attributes (modification), parts (meronymy), and functions (predication), but only meronymy is presently implemented in the noun files. Antonymy is also found between nouns, but it is not a fundamental organizing principle for nouns. Coverage is partitioned into twenty-five topical files, each of which deals with a different primitive semantic component."
"In terms of coverage, WordNet's goals differ little from those of a good standard handheld collegiate-level dictionary."
WordNet does not aim to cover proper nouns
"a good dictionary is a remarkable store of information: ... different kinds of information packed into lexical entries: spelling, pronunciation, inflected and derivative forms, etymology, part of speech, definitions and illustrative uses of alternative senses, synonyms and antonyms, special usage notes, occasional line drawings or plates ..." etc.
the underlying logic of a dictionary:
superordinate plus distinguishers
but if someone asks how to improve a dictionary ...
"What is missing from this definition [e.g. of 'tree': felicitous - a large, woody, perennial plant with a distinct trunk]? Anyone educated to expect this kind of thing in a dictionary will not feel that anything is missing. But the definition is woefully incomplete. It does not say, for example, that trees have roots, or that they consist of cells having cellulose walls, or even that they are living organisms. Of course, if you look up the superordinate term, plant, you may find that kind of information - unless, of course, you make a mistake and choose the definition of plant that says it is a place where some product is manufactured. There is, after all, nothing in the definition of tree that specifies which sense of plant is the appropriate superordinate. That specification is omitted on the assumption that the reader is not an idiot, a Martian, or a computer. But it is instructive to note that, even though intelligent readers can supply it for themselves, important information about the superordinate term is missing from the definition."
"Second, this definition of tree contains no information about coordinate terms. The existence of other kinds of plants is a plausible conjecture, but no help is given in finding them."
"Third, a similar challenge faces a reader who is interested in knowing the different kinds of trees. ... The prototypical definition points upward, to a superordinate term, not sideways to coordinate terms or downward to hyponyms."
"Fourth, everyone knows a great deal about trees that lexicographers would not include in a definition of tree. For example, trees have bark and twigs, they grow from seeds, adult trees are much taller than human beings, they manufacture their own food by photosynthesis, they provide shade and protection from the wind, they grow wild in forests, their wood is used in construction and for fuel, and so on. Someone who was totally innocent about trees would not be able to construct an accurate concept of them if nothing more were available than the information required to define tree. A dictionary definition draws some important distinctions and serves to remind the reader of something that is presumed to be familiar already; it is not intended as a catalogue of general knowledge. There is a place for encyclopedias as well as dictionaries."
"Note that much of the missing information is structural, rather than factual. That is to say, lexicographers make an effort to cover all of the factual information about the meanings of each word, but the organization of the conventional dictionary into discrete, alphabetized entries and the economic pressure to minimize redundancy make the reassembly of this scattered information a formidable chore."
"Since words are used to define words, how can lexicography escape circularity?"
"The fundamental design that lexicographers try to impose on the semantic memory for nouns is not a circle, but a tree (in the sense of tree as a graphical representation). It is a defining property of tree graphs that they branch from a single stem without forming circular loops."
Relations:
The semantic relation that is represented above by '@' has been called the ISA relation, or the hypernymic or superordinate relation.
The inverse semantic relation '~' goes from generic to specific (from superordinate to hyponym) and so is a specialization.
Caveats:
"It should be noted, at least parenthetically, that WordNet assumes that a distinction can always be drawn between synonymy and hyponymy. In practice, of course, this distinction is not always clear"
"somewhere a line must be drawn between lexical concepts and general knowledge, and WordNet is designed on the assumption that the standard lexicographic line is probably as distinct as any could be"
"Since WordNet is supposed to be organized according to principles governing human lexical memory, the decision to organize the nouns as an inheritance system reflects a psycholinguistic judgment about the mental lexicon. What kinds of evidence provide a basis for such decisions?"
"The isolation of nouns into a separate lexical subsystem receives some support from clinical observations of patients with anomic aphasia. After a left-hemisphere stroke that affects the ability to communicate linguistically, most patients are left with a deficit in naming ability (Caramazza and Berndt, 1978). In anomic aphasia, there is a specific inability to name objects. When confronted with an apple, say, patients may be unable to utter 'apple', even though they will reject such suggestions as shoe or banana, and will recognize that apple is correct when it is provided. They have similar difficulties in naming pictured objects, or in providing a name when given its definition, or in using nouns in spontaneous speech. Nouns that occur frequently in everyday usage tend to be more accessible than are rarely used nouns, but a patient with severe anomia looks for all the world like someone whose semantic memory for nouns has become disconnected from the rest of the lexicon. However, clinical symptoms are characterized by great variability from one patient to the next, so no great weight should be assigned to such observations."
"Psycholinguistic evidence that knowledge of nouns is organized hierarchically comes from the ease with which people handle anaphoric nouns and comparative constructions. (1) Superordinate nouns can serve as anaphors referring back to their hyponyms. For example, in such constructions as He owned a rifle, but the gun had not been fired, it is immediately understood that the gun is an anaphoric noun with a rifle as its antecedent. Moreover, (2) superordinates and their hyponyms cannot be compared (Bever and Rosenbaum, 1970). For example, both A rifle is safer than a gun and A gun is safer than a rifle are immediately recognized as semantically anomalous. Such judgments demand an explanation in terms of hierarchical semantic relations."
"More to the point, however, is the question: is there psycholinguistic evidence that people's lexical memory for nouns forms an inheritance system? The first person to make this claim explicit seems to have been Quillian (1967, 1968). Experimental tests of Quillian's proposal were reported in a seminal paper by Collins and Quillian (1969), who assumed that reaction times can be used to indicate the number of hierarchical levels separating two meanings. They observed, for example, that it takes less time to respond True to 'A canary can sing' than to 'A canary can fly', and still more time is required to respond True to 'A canary has skin'. In this example, it is assumed that can sing is stored as a feature of canary, can fly as a feature of bird, and has skin as a feature of animal. If all three features had been stored directly as features of canary, they could all have been retrieved with equal speed. The reaction times are not equal because additional time is required to retrieve can fly and has skin from the superordinate concepts. Collins and Quillian concluded from such observations that generic information is not stored redundantly, but is retrieved when needed. (In WordNet, the hierarchy is: canary @-> finch @-> passerine @-> bird @-> vertebrate @-> animal, but these intervening levels do not affect the general argument that Collins and Quillian were making.)"
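The Collins & Quillian prediction - verification time grows with the number of ISA links between the noun and the level where the predicate is stored - can be sketched as a traversal count. The chain follows the WordNet example above; the feature placements and function names are illustrative, not the original experiment:

```python
# hyponym -> superordinate, following the canary chain quoted above
hypernym = {"canary": "finch", "finch": "passerine", "passerine": "bird",
            "bird": "vertebrate", "vertebrate": "animal"}

# Each predicate is stored once, at the most general level it holds.
stored_at = {"can sing": "canary", "can fly": "bird", "has skin": "animal"}

def levels_to_verify(noun, predicate):
    """Count ISA links traversed before the predicate is found."""
    level, node = 0, noun
    while node is not None:
        if stored_at.get(predicate) == node:
            return level
        node = hypernym.get(node)
        level += 1
    return None  # predicate not stored anywhere on this chain

print(levels_to_verify("canary", "can sing"))  # 0: stored on canary itself
print(levels_to_verify("canary", "can fly"))   # 3: via finch, passerine, bird
print(levels_to_verify("canary", "has skin"))  # 5: all the way up to animal
```

On the inheritance account, reaction time should rise with this count; the robin/ostrich and has-ears results quoted below are problems precisely because the counts there are equal but the times are not.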
"Most psycholinguists agree that English common nouns are organized hierarchically in semantic memory, but whether generic information is inherited or is stored redundantly is still moot (Smith, 1978). The publication of Collins and Quillian's (1969) experiments stimulated considerable research, in the course of which a number of problems were raised. For example, according to Quillian's theory, robin and ostrich share the same kind of semantic link to the superordinate bird, yet 'A robin is a bird' is confirmed more rapidly than is 'An ostrich is a bird' (Wilkins, 1971). Or, again, can move and has ears are both properties that people associate with animal, yet 'An animal can move' is confirmed more rapidly than is 'An animal has ears' (Conrad, 1972). From these and similar results, many psycholinguists concluded that Quillian was wrong, that semantic memory for nouns is not organized as an inheritance system."
"An alternative conclusion - the conclusion on which WordNet is based - is that the inheritance assumption is correct, but that reaction times do not measure what Collins and Quillian, and other experimentalists, assumed they did. Perhaps reaction times indicate a pragmatic rather than a semantic distance - a difference in word use, rather than a difference in word meaning (Miller and Charles, 1991)."
you can
either put some semantically-empty abstract component at the top of the
hierarchy or partition the nouns with a set of semantic primes (generic
concepts, each at the top of separate hierarchies)
{act,
action, activity}
{animal, fauna}
{artifact}
{attribute, property}
{body, corpus}
{cognition, knowledge}
{communication}
{event, happening}
{feeling, emotion}
{food}
{group, collection}
{location, place}
{motive}
{natural object}
{natural phenomenon}
{person, human being}
{plant, flora}
{possession}
{process}
{quantity, amount}
{relation}
{shape}
{state, condition}
{substance}
{time}
"The problem, of course, is to decide what these primitive semantic components should be. ... One important criterion is that, collectively, they should provide a place for every English noun."
"These hierarchies vary widely in size and are not mutually exclusive - some cross-referencing is required - but on the whole they cover distinct conceptual and lexical domains. They were selected after considering the possible adjective-noun combinations that could be expected to occur" (???) (Johnson-Laird)
the 25 categories could be partly grouped in a
top level (e.g. into living/non-living)
"Lexical inheritance systems, however, seldom go more than ten levels deep, and the deepest examples usually contain technical levels that are not part of the everyday vocabulary."
"These hierarchies of nominal concepts are said to have a level, somewhere in the middle, where most of the distinguishing features are attached."
"Above the basic level, descriptions are brief and general. Below the base level, little is added to the features that distinguish basic concepts. These observations have been made largely for the names of concrete, tangible objects, but some psycholinguists have argued that a base or primary level should be a feature of every lexical hierarchy (Hoffman and Ziessler, 1983)."
"It must be possible to associate canary appropriately with at least three different kinds of distinguishing features (Miller, in press):
(1) Attributes: small, yellow (adjectives)
(2) Parts: beak, wings (nouns)
(3) Functions: sing, fly" (verbs)
In 1993, "only the pointers to parts, which go from nouns to nouns, have been implemented."
"As more distinguishing features come to be indicated by pointers, these glosses should become even more redundant. An imaginable test of the system would then be to write a computer program that would synthesize glosses from the information provided by the pointers."
"adjectives are said to modify nouns, or nouns are said to serve as arguments for attributes: Size(canary) = small"
this relation is not symmetric
"Here it is sufficient to point out that the attributes associated with a noun are reflected in the adjectives that can normally modify it. For example, a canary can be hungry or satiated because hunger is a feature of animals and canaries are animals, but a stingy canary or a generous canary could only be interpreted metaphorically, since generosity is not a feature of animals in general, or of canaries in particular."
"Keil (1979, 1983) has argued that children learn the hierarchical structure of nominal concepts by observing what can and cannot be predicated at each level. For example, the important semantic distinction between animate and inanimate nouns derives from the fact that the adjectives dead and alive can be predicated of one class of nouns but not of the other."
"The part-whole relation between nouns is generally considered to be a semantic relation, called meronymy (from the Greek meros, part; Cruse, 1986), comparable to synonymy, antonymy, and hyponymy. The relation has an inverse: if Wm is a meronym of Wh, then Wh is said to be a holonym of Wm."
"Meronyms are distinguishing features that hyponyms can inherit. Consequently, meronymy and hyponymy become intertwined in complex ways."
"Although the connections may appear complex when dissected in this manner, they are rapidly deployed in language comprehension. For example, most people do not even notice the inferences required to establish a connection between the following sentences: It was a canary. The beak was injured."
"It has been said that distinguishing features are introduced into noun hierarchies primarily at the level of basic concepts; some claims have been made that meronymy is particularly important for defining basic terms (Tversky and Hemenway, 1984)."
"The 'part of' relation is often compared to the 'kind of' relation: both are asymmetric and (with reservations) transitive, and can relate terms hierarchically (Miller and Johnson-Laird, 1976)."
"... it sounds odd to say 'The house has a handle' or 'The handle is a part of the house.' Winston, Chaffin, and Hermann (1987) take such failures of transitivity to indicate that different part-whole relations are involved in the two cases. For example, 'The branch is a part of the tree' and 'The tree is a part of a forest' do not imply that 'The branch is a part of the forest' because the branch/tree relation is not the same as the tree/forest relation."
"Such observations raise questions about how many different 'part of' relations there are. Winston et al. (1987) differentiate six types of meronyms: component-object (branch/tree), member-collection (tree/forest), portion-mass (slice/cake), stuff-object (aluminum/airplane), feature-activity (paying/shopping), and place-area (Princeton/New Jersey). Chaffin, Hermann, and Winston (1988) add a seventh: phase-process (adolescence/growing up). Meronymy is obviously a complex semantic relation - or set of relations.
Only three of these types of meronymy are coded in WordNet:
Wm #p-> Wh indicates that Wm is a component part of Wh;
Wm #m-> Wh indicates that Wm is a member of Wh; and
Wm #s-> Wh indicates that Wm is the stuff that Wh is made from.
Of these three, the 'is a component of' relation #p is by far the most frequent."
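The three coded pointer types can be sketched as labelled arcs. The data below is a tiny illustrative fragment (reusing Winston et al.'s examples), not WordNet's real noun files:

```python
# (meronym, pointer type, holonym) triples: #p component part,
# #m member, #s stuff/substance.
meronyms = [
    ("branch",   "#p", "tree"),      # a branch is a component of a tree
    ("tree",     "#m", "forest"),    # a tree is a member of a forest
    ("aluminum", "#s", "airplane"),  # an airplane is made from aluminum
]

def parts_of(whole, kind=None):
    """Meronyms of `whole`, optionally filtered by pointer type."""
    return [m for (m, k, h) in meronyms if h == whole and kind in (None, k)]

print(parts_of("tree"))               # component parts of tree
print(parts_of("forest", kind="#m"))  # members of forest
```

Keeping the types distinct also captures the transitivity failure above: branch #p tree and tree #m forest chain through different relations, so nothing licenses "the branch is a part of the forest".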
"For commonsense purposes, the dissection of an object terminates at the point where the parts no longer serve to distinguish this object from others with which it might be confused. Knowing where to stop requires commonsense knowledge of the contrasts that need to be drawn."
"Tangled hierarchies are rare when hyponymy is the semantic relation. In meronymic hierarchies, on the other hand, it is common; point, for example, is a meronym of arrow, awl, dagger, fishhook, harpoon, icepick, knife, needle, pencil, pin, sword, tine; handle has an even greater variety of holonyms. Since the points and handles involved are so different from one holonym to the next, it is remarkable that this situation causes as little confusion as it does."
"A functional feature of a nominal concept is intended to be a description of something that instances of the concept normally do, or that is normally done with or to them."
"the uses to which a thing is normally put are a central part of a person's conception of that thing."
"There are also linguistic reasons to assume that a thing's function is a feature of its meaning. Consider the problem of defining the adjective good. A good pencil is one that writes easily, a good knife is one that cuts well, a good paint job is one that covers completely, a good light is one that illuminates brightly, and so on. ... It is unthinkable that all of these different meanings should be listed in a dictionary entry for good."
"One solution is to define (one sense of) good as 'performs well the function that its head noun is intended to perform' (Katz, 1964)."
"In terms of the present approach to lexical semantics, functional information should be included by pointers to verb concepts, just as attributes are included by pointers to adjective concepts. In many cases, however, there is no single verb that expresses the function. And in cases where there is a single verb, it can be circular. For example, if the noun hammer is defined by a pointer to the verb hammer, both concepts are left in need of definition. More appropriately, the noun hammer should point to the verb pound ..."
"however: what is the function of apple or cat?"
"Although functional pointers from nouns to verbs have not yet been implemented in WordNet, the hyponymic hierarchy itself reflects function strongly. For example, a term like weapon demands a functional definition, yet hyponyms of weapon - gun, sword, club, etc. - are specific kinds of things with familiar structures (Wierzbicka, 1984). Indeed, many tangles in the noun hierarchy result from the competing demands of structure and function. Particularly among the human artifacts there are things that have been created for a purpose; they are defined both by structure and use, and consequently earn double superordinates."
"The strongest psycholinguistic indication that two words are antonyms is that each is given on a word association test as the most common response to the other."
"Semantic opposition is not a fundamental organizing relation between nouns, but it does exist and so merits its own representation in WordNet. For example, the synsets for man and woman would contain:
{[man, woman, !], person, @ ... (a male person)}
{[woman, man, !], person, @ ... (a female person)}
where the symmetric relation of antonymy is represented by the '!' pointer, and square brackets indicate that antonymy is a lexical relation between words, rather than a semantic relation between concepts."
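The word-level (not synset-level) character of the '!' pointer can be sketched with the rise/fall example from earlier in the notes. The encoding below is a hypothetical illustration, not WordNet's file format:

```python
# Lexical relation: antonymy pairs specific word forms, symmetrically.
antonym = {"man": "woman", "woman": "man",
           "rise": "fall", "fall": "rise",
           "ascend": "descend", "descend": "ascend"}

# Semantic relation: synonymy groups forms into synsets (concepts).
synsets = [frozenset({"rise", "ascend"}), frozenset({"fall", "descend"})]

# Because antonymy links word forms rather than synsets, the pairing
# does not transfer across synonyms: rise/fall and ascend/descend are
# listed, but rise/descend is simply absent.
print(antonym["rise"])                      # fall
print(antonym.get("rise") == "descend")     # False: lexical, not conceptual
```

A purely semantic encoding would instead link the two synsets, wrongly predicting that any member of {rise, ascend} is an antonym of any member of {fall, descend}.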
"When all three kinds of semantic relations - hyponymy, meronymy, and antonymy - are included, the result is a highly interconnected network of nouns. A graphical representation of a fragment of the noun network is shown in Figure 2. There is enough structure to hold each lexical concept in its appropriate place relative to the others, yet there is enough flexibility for the network to grow and change with learning."
Abstract
I would like to pose a set of fundamental questions regarding the constraints we can place on the structure of our concepts, particularly as revealed through language. I will outline a methodology for the construction of ontological types based on the dual concerns of capturing linguistic generalizations and satisfying metaphysical considerations. I discuss what 'kinds of things' there are, as reflected in the models of semantics we adopt for our linguistic theories. I argue that the flat and relatively homogeneous typing models coming out of classic Montague Grammar are grossly inadequate to the task of modelling and describing language and its meaning. I outline aspects of a semantic theory (Generative Lexicon) employing a ranking of types. I distinguish first between natural (simple) types and functional types, and then motivate the use of complex types (dot objects) to model objects with multiple and interdependent denotations. This approach will be called the Principle of Type Ordering. I will explore what the top lattice structures are within this model, and how these constructions relate to more classic issues in syntactic mapping from meaning.
At the time of Miller's writing, derivational morphology had not been included, and it doesn't appear to have been since then either.
does WordNet incorporate derivational morphology yet???
am dubious about Collins & Quillian's results about the reaction times of questions moving up/down different levels of superordinacy (sp???) - if they'd asked if a rat has legs, people would have had little trouble, but it's the fact that we don't think of birds in general as having skin that slows the response down
what exactly is he trying to do???
what's Montague Grammar???
telic???